Ord i Dag: Mining Norwegian Daily Newswire

Authors

  • Unni Cathrine Eiken
  • Anja Therese Liseth
  • Hans Friedrich Witschel
  • Matthias Richter
  • Christian Biemann
Abstract

We present Ord i Dag, a new service that displays today's most important keywords. These are extracted fully automatically from Norwegian online newspapers. Describing the complete process, we provide an entirely disclosed method for media monitoring and news summarization. For keyword extraction, a reference corpus serves as background about average language use, which is contrasted with the current day’s word frequencies. Having detected the most prominent keywords of a day, we introduce several ways of grouping and displaying them in intuitive ways. A discussion about possible applications concludes. Up to now, the service is available for Norwegian and German. As only some shallow language-specific processing is needed, it can easily be set up for other languages.
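The contrast between a day's word frequencies and a reference corpus can be illustrated with a minimal sketch. The abstract does not say which statistic the authors use, so the smoothed relative-frequency ratio below is an assumption chosen for simplicity, and the names `tokenize`, `keyword_scores`, `top_keywords` and the `min_count` threshold are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; in Python 3, \w also matches letters such as æ, ø, å.
    return re.findall(r"\w+", text.lower())


def keyword_scores(today_tokens, reference_tokens, min_count=2):
    """Score each word of the day by how much more frequent it is today
    than in the reference corpus (assumed measure, not the authors')."""
    today = Counter(today_tokens)
    ref = Counter(reference_tokens)
    n_today = sum(today.values())
    n_ref = sum(ref.values())
    vocab = len(set(today) | set(ref))

    scores = {}
    for word, count in today.items():
        if count < min_count:
            continue  # skip words too rare today to be meaningful
        # Add-one smoothing so words unseen in the reference do not divide by zero.
        p_today = (count + 1) / (n_today + vocab)
        p_ref = (ref[word] + 1) / (n_ref + vocab)
        scores[word] = p_today / p_ref
    return scores


def top_keywords(today_text, reference_text, k=5, min_count=2):
    scores = keyword_scores(tokenize(today_text), tokenize(reference_text), min_count)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Words that are common in everyday language score close to 1 and drop out of the ranking, while words that are unusually frequent today rise to the top. A production system would likely use a significance test rather than a raw ratio, but the abstract leaves this open.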

Similar resources

Automatisk splitting av sammensatte ord-et lingvistisk hjelpemiddel for tekstsøking (Automatic splitting of compound words-A linguistic aid for text search) [In Norwegian]

Compound words cause problems for various kinds of automatic analysis of the vocabulary of a text, e.g. frequency studies. The problem is that the meaning of a compound word can in many cases also be expressed by a phrase built from the corresponding non-compound words. In text search, for example, compound words can cause one to miss the documents one is looking for because the ...

Dag Haug: med - the syntax and semantics of concomitance in Norwegian

The paper discusses the syntax and semantics of the Norwegian preposition med, which denotes a variety of concomitance relations. The c-structural, f-structural and semantic properties of the preposition are examined. Special emphasis is put on the constructions which involve syntax-semantics mismatches, such as bare noun phrases denoting sets of states instead of sets of individuals and it is ...

A Novel Web Usage Mining Method - Mining and Clustering of DAG Access Patterns Considering Page Browsing Time

In this paper, we propose a novel method to analyze web access logs. The proposed method defines a web access pattern as a DAG with page browsing time, and extracts the patterns using the closed frequent embedded DAG mining algorithm, DIGDAG. The proposed method succeeds in extracting only as few patterns as necessary, and enables more efficient analysis by clustering the extract...

An optimal shape encoding scheme using skeleton decomposition

Abstract: This paper presents an operational rate-distortion (ORD) optimal approach for skeleton-based boundary encoding. The boundary information is first decomposed into skeleton and distance signals, by which a more efficient representation of the original boundary results. Curves of arbitrary order are utilized for approximating the skeleton and distance signals. For a given bit budget for a vid...

Publication date: 2006